Make source_semantic_models property accessible from a DataflowPlanNode #1218

tlento · 2024-05-16T21:45:32Z

This change adds the source_semantic_models property to the
DataflowPlan and adds a hook for accessing it from any
DataflowPlanNode.

We currently have two use cases for this: one in the cloud codebase,
which needs the semantic model inputs for a dataflow plan, and the
upcoming predicate pushdown evaluation, which needs the semantic model
inputs for a given DataflowPlanNode.

An earlier version of this change added the property directly to the
DataflowPlanNode, which would satisfy both use cases above. The issue
with assigning this property directly to a DataflowPlanNode is that it
could be read as either a node-level or a graph-level attribute, so it's
not clear where to put the accessor.

The solution we came up with was to allow access to a DataflowPlan
DAG object built from the node, which effectively encapsulates the
subgraph represented by the node and its ancestors. We can then access
these subgraph properties through the DataflowPlan while making it clear
to the caller that what they are asking for is a subgraph-level attribute
rather than a node-level one.
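
As a rough illustration of the shape described above, here is a minimal toy sketch. It is not the MetricFlow implementation: `ToyNode`, `ToyPlan`, the `as_plan` method name, and the `semantic_model` field are all hypothetical stand-ins, and the traversal is just one plausible way to collect inputs from a node and its ancestors.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class ToyNode:
    """Hypothetical stand-in for a DataflowPlanNode."""

    node_id: str
    parent_nodes: Tuple["ToyNode", ...] = ()
    # In this toy, only source (read) nodes carry a semantic model name.
    semantic_model: Optional[str] = None

    def as_plan(self) -> "ToyPlan":
        """Wrap this node and its ancestors in a plan (hypothetical method name)."""
        return ToyPlan(sink_nodes=(self,))


@dataclass(frozen=True)
class ToyPlan:
    """Hypothetical stand-in for a DataflowPlan."""

    sink_nodes: Tuple[ToyNode, ...]

    @property
    def source_semantic_models(self) -> FrozenSet[str]:
        """Collect semantic model names from every node reachable from the sinks."""
        found = set()
        seen = set()
        stack = list(self.sink_nodes)
        while stack:
            node = stack.pop()
            if node.node_id in seen:
                continue  # shared ancestors are visited only once
            seen.add(node.node_id)
            if node.semantic_model is not None:
                found.add(node.semantic_model)
            stack.extend(node.parent_nodes)
        return frozenset(found)


# Small example graph: two reads feed a join; a filter sits on one read only.
orders = ToyNode("read_orders", semantic_model="orders")
customers = ToyNode("read_customers", semantic_model="customers")
join = ToyNode("join", parent_nodes=(orders, customers))
filter_orders = ToyNode("filter_orders", parent_nodes=(orders,))

join_models = join.as_plan().source_semantic_models
filter_models = filter_orders.as_plan().source_semantic_models
```

The caller asks the plan, not the node, for `source_semantic_models`, so the call site itself signals that the answer covers the whole subgraph behind that node.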

tlento · 2024-05-16T21:45:43Z

This stack of pull requests is managed by Graphite. Learn more about stacking.

github-actions · 2024-05-16T21:45:48Z

Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.

courtneyholcomb

Looks good!! 🚀

courtneyholcomb · 2024-05-16T22:15:46Z

metricflow/dataflow/dataflow_plan.py

+        inside of each node, we make those properties of the DataflowPlan, and this node-level converter makes
+        such properties easily accessible.
+        """
+        return DataflowPlan(sink_nodes=(self,))


So simple 🤯

Right? It was borderline impossible before @plypaul merged #1205

courtneyholcomb · 2024-05-16T22:18:23Z

metricflow/dataflow/dataflow_plan.py

@@ -188,3 +208,24 @@ def __init__(self, sink_nodes: Sequence[DataflowPlanNode], plan_id: Optional[Dag
    @property
    def sink_node(self) -> DataflowPlanNode:  # noqa: D102
        return self._sink_nodes[0]
+
+    def __complete_subgraph(self, node: DataflowPlanNode) -> Sequence[DataflowPlanNode]:


Should this be a staticmethod?

Sure, I can do that.
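
For context on the suggestion, a helper that walks a node's ancestors needs no plan instance state, which is why it can be static. The following is a minimal sketch only: `SketchNode`, `SketchPlan`, `_collect_subgraph`, and `subgraph_nodes` are hypothetical names, not the ones used in the PR.

```python
from __future__ import annotations

from typing import List, Sequence, Tuple


class SketchNode:
    """Hypothetical stand-in for a DataflowPlanNode."""

    def __init__(self, node_id: str, parent_nodes: Sequence["SketchNode"] = ()) -> None:
        self.node_id = node_id
        self.parent_nodes: Tuple["SketchNode", ...] = tuple(parent_nodes)


class SketchPlan:
    """Hypothetical stand-in for a DataflowPlan."""

    def __init__(self, sink_nodes: Sequence[SketchNode]) -> None:
        self._sink_nodes = tuple(sink_nodes)

    @staticmethod
    def _collect_subgraph(node: SketchNode) -> List[SketchNode]:
        """Return the node followed by all of its ancestors.

        Only the argument is used, never `self`, so this can be a staticmethod.
        Ancestors reachable by more than one path appear more than once.
        """
        collected = [node]
        for parent in node.parent_nodes:
            collected.extend(SketchPlan._collect_subgraph(parent))
        return collected

    @property
    def subgraph_nodes(self) -> List[SketchNode]:
        nodes: List[SketchNode] = []
        for sink in self._sink_nodes:
            nodes.extend(SketchPlan._collect_subgraph(sink))
        return nodes


a = SketchNode("a")
b = SketchNode("b", (a,))
c = SketchNode("c", (a, b))
node_ids = [n.node_id for n in SketchPlan((c,)).subgraph_nodes]
```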

courtneyholcomb

🚢 🚢 🚢

tlento · 2024-05-24T00:33:22Z

Merge activity

May 23, 5:33 PM PDT: @tlento started a stack merge that includes this pull request via Graphite.
May 23, 5:47 PM PDT: Graphite rebased this pull request as part of a merge.
May 23, 5:52 PM PDT: @tlento merged this pull request with Graphite.

tlento requested review from courtneyholcomb and plypaul May 16, 2024 21:45

cla-bot bot added the cla:yes label May 16, 2024

courtneyholcomb approved these changes May 16, 2024

tlento force-pushed the make-source-semantic-models-available-from-nodes branch from c1faf9a to 909aabb Compare May 17, 2024 01:27

tlento force-pushed the use-pushdown-params-for-disabling-time-constraint-pushdown branch from dc94cfc to 31ee2df Compare May 22, 2024 01:42

tlento force-pushed the make-source-semantic-models-available-from-nodes branch from 909aabb to 9ef3244 Compare May 22, 2024 01:42

tlento requested a review from courtneyholcomb May 22, 2024 01:42

courtneyholcomb approved these changes May 23, 2024

tlento force-pushed the use-pushdown-params-for-disabling-time-constraint-pushdown branch from 31ee2df to 3a02ea7 Compare May 23, 2024 23:41

tlento force-pushed the make-source-semantic-models-available-from-nodes branch from 9ef3244 to 45cf01f Compare May 23, 2024 23:41

tlento mentioned this pull request May 23, 2024

Add semantic_model_origin property to LinkableElement interface #1230

Merged

tlento force-pushed the use-pushdown-params-for-disabling-time-constraint-pushdown branch from 3a02ea7 to 9769994 Compare May 24, 2024 00:40

Base automatically changed from use-pushdown-params-for-disabling-time-constraint-pushdown to main May 24, 2024 00:46

tlento added 5 commits May 24, 2024 00:47

Add changelog

3dbb0e9

Rename subgraph helper method

4adcbae

Make helper method static

18662b0

Namespace dataflow plan subgraphs by prefix

f0c6cb9

tlento force-pushed the make-source-semantic-models-available-from-nodes branch from 45cf01f to f0c6cb9 Compare May 24, 2024 00:47

tlento merged commit ba30c80 into main May 24, 2024
15 checks passed

tlento deleted the make-source-semantic-models-available-from-nodes branch May 24, 2024 00:52

tlento mentioned this pull request Jun 4, 2024

Add integration tests for filters against various join types #1240

Merged
